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A d ata management system (100) and method for storing and 
communicating different types of data allowing variable lengths 
and hierarchical nesting of data records. Hierarchical structure 
is implicidy defined by relationships of the lengths fields of data 
records. The system (100) and method use data-containing records 
to store Hfifa and container records to contain other records in order 
to define the hierarchical structure and thereby greatly facilitate the 
movement and management of the stored data (107). 
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DATA MANAGEMENT T JSTNC NES TED RECORDS AND CODE POINT 



Background of the Invention 

1. Field of Hip Invention 

This invention relates to distributed data management and, more par- 
5 ticularly, to a persistent index storage system and method allowing storage and 
manipulation of data using hierarchical nested records and code points. 

2. Description of tb «* Related Art 

Conventional distributed processor systems facilitate execution of several 
different processes simultaneously on different processors. Systems and methods 

10 for exchanging data among these processors exist in the prior art (see U.S. patent 
number 5,056,003 to Hammer et al. for Distributed Data Management 
Mechanism). Such systems and methods present techniques for reducing storage 
requirements by splitting a logical data stream into segments residing at different 
processor storage locations. Descriptor elements indicate the location and length 

15 of the various segments so that they can be reassembled into the complete logical 
data string. 

Hammer et al. and related techniques suffer from several disadvantages. 
One disadvantage is that hierarchical data structures are not generally supported; 
if the data stream contains nested records, there is no built-in technique for 

20 navigating through the data structure. Thus, individual applications are required 
to deal with the hierarchical structure, as they cannot rely on the data 
management system to do so. This adds to programming overhead. Another 
disadvantage is that logical data strings have a fixed maximum length due to the 
fixed and limited space available for the descriptor element specifying length. 

25 Finally, existing systems generally provide only a mechanism for communicating 
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and passing data strings among processors, but do not provide a persistent 
storage system. ' . 



Summary of the Invention 
5 In accordance with the present invention, there is provided a system and 

method of data management permitting storage and communication of different 
types of data, and allowing hierarchical nesting of data records and variable 
lengths of records. Hierarchical structure is implicitly defined by relationships of 
length fields of data records. Thus, movement and management of data stored 

10 according to the structure of the present invention is greatly facilitated. 

Individual applications are not required to deal with navigation through the 
hierarchical structure, since the data management system does so automatically. 
In addition, the data structure provides for movement of "chunks" of records that 
have a hierarchical relationship with one another. Finally, the system of the 

1 5 present invention includes format information to facilitate data transfers between 
applications employing different data formats. 

The data structure defined by the present invention includes two types of 
records: data-containing records and container records. Data-containing records 
contain data, while container records contain other records. A code point found 

20 at the beginning of each record specifies its type. A length field facilitates variable 
data lengths for data-containing records, as well as implicit definition of a 
hierarchical structure among records. A format field facilitates movement of data 
among environments having differing data formats. Code point definitions may 
be system-wide, or application-specific, depending on the level of the particular 

25 record in the overall hierarchy. 

In defining the hierarchical relationship among records implicitly by virtue 
of the various length fields, the data structure provides the above-recited 
advantages without requiring large amounts of data storage. 
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BRIEF DES CRIPTION OF THE DRAWINGS 

Figure 1 is a block diagram of a preferred hardware environment for tht 
present invention. 

Figure 2 is a diagram of the data formats used to store container records 
5 and data-containing records in the preferred embodiment. 

Figure 3 is a diagram of a sample string of data bytes according to the 

preferred embodiment. 

Figure 4 is a flowchart showing the process of deleting a record. 
Figure 5 is a flowchart showing the process of adding a record. 
10 Figure 6 is a diagram showing an example of a hierarchically-arranged 

collection of data fields suitable for storage according to the present invention. 
Figure 7 is a diagram showing an example of a hierarchical relationship 

among data fields. 

Figure 8 is a flowchart showing the process of inserting a record. 

15 DESCRIPTION OF THE PREFERRED EMBODIMENT 

Referring now to Figure 1, there is shown a block diagram of a preferred 
hardware environment for the present invention. In the preferred embodiment, 
computer 100 may be a conventional IBM-compatible personal computer 
including standard communication interface hardware 101, Intel processor 102, 
20 input device 103 such as keyboard and/or mouse, output device 104 such as 

display screen and/or printer, and random access memory (RAM) 105. RAM 105 
is divided into several sections, each of which will be described below. 

Application programs 106 are stored in RAM 105 in a conventional 
manner. Data storage 107 stores data for use by the computer, including data 
25 objects 110 organized according to the code point structure of the present in- 
vention, as described more fully below. Some of the data may be stored on disk 
other media (not shown), and may be swapped into RAM 105 as needed, 
ding to conventional means. Code point library 108 contains the software 
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routines for accessing data storage 107, including navigation through the data * 
structure, interfacing, and indexing. Code point library 108 contains instructions 
for processor 102 for reading and modifying record lengths, and for performing 
operations on the stored data as described below. In response to calls from 
5 application programs 106, code point library 108 contains instructions for 
processor 102 to return record information and update the stored data ac- 
cordingly. 

Lookup tables 109 provide definitions for various code points, so mat 
application programs 106 can interpret the data stored in data storage 107. A 
separate lookup table 109 may be provided for each application program 106, so 
that the application program 106 need only be concerned with a particular subset 
of the code point definitions. 

Hierarchical Data 

Referring now to Figure 6, there is shown an example of a hierarchically- 
arranged collection of data records suitable for storage according to the 
techniques of the present invention. For purposes of illustration, COMPANY 
record 600 is shown containing several records of information. All of the records 
contained within COMPANY record 600 relate to a single company. As can be 
seen, the records contained within COMPANY record 600 are arranged in a 
20 hierarchical structure. 

Specifically, COMPANY record 600 contains one COMPANY DATA 
record 601 and two REGION records 602. COMPANY DATA record 601 de- 
scribes the company as a whole, while REGION records 602 each describe a re- 
gion associated with the company. Each REGION record 602 is further subdi- 
25 vided into a REGION DATA record 604 and two DISTRICT records 605. Each 
REGION DATA record 604 describes the region as a whole, while DISTRICT 
records 605 each describe a district associated with the region. Each DISTRICT 
record 605 is further subdivided into a DISTRICT DATA record 606 and two 
STORE DATA records 607. Each DISTRICT DATA record 606 describes the 



15 
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district as a whole, while STORE DATA records 607 each describe a store asso- 
ciated with the district. ' 

Thus, a hierarchical structure emerges. Each individual record contained 
within COMPANY record 600 may be classified as either a container record or a 

5 data-containing record. Container records contain other records, while data- 
containing records contain data. No record is both a container record and a data- 
containing record. Record 600 is a container record, as are REGION records 602 
and DISTRICT records 605. COMPANY DATA record 601 is a data-containing 
record, as are REGION DATA records 604, DISTRICT DATA retards 606, and 

10 STORE DATA records 607. Thus, it can be seen that the container records serve to 
define and maintain the hierarchical relationship among the data-containing 
records. 

Implicit in the hierarchical structure shown in Figure 6 is a series of nested 
"levels" of data. COMPANY record 600 is a Level 0 record, since no records 

15 contain it. In all code point structures according to the preferred embodiment, the 
first record, typically designated as Level 0, is always a container record. 
COMPANY DATA record 601 and REGION records 602 are Level 1 records, since 
they are each contained by one record. REGION DATA records 604 and 
DISTRICT records 605 are Level 2 records, since they are each contained by two 

20 records. DISTRICT DATA records 606 and STORE DATA records 607 are Level 3 
records, since they are each contained by three records. 

Referring now to Figure 7, there is shown another example of a hierarchical 
relationship among records. Record 700 contains several hierarchically-organized 
records. Container records 701 contain other records, while data-containing 

25 records 702 contain data. 

Overall Data Structure 

The system and method of the present invention provide techniques of 
efficient storage and transmission of hierarchically-organized data of the type 
exemplified by Figures .6 and 7. 
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The following table is a representation of an example of a portion of 
memory that is organized according to the hierarchical code point structure of the 
present invention. Each record in the table corresponds to one of the records of 
the example of Figure 6, described above. 





Record Nsme 


Length 


1-6 


Company 


447 


7-33 


Company Data 


27 


34-39 


Region 


207 


40-66 


Region Data 


27 


67-72 


District 


87 


73-99 


District Data 


27 


100-126 


Store Data 


27 


127-153 


Store Data 


27 


154-159 


District 


87 


160-186 


District Data 


27 


187-213 


Store Data 


27 


214-240 


Store Data 


27 


241-246 


Region 


207 


247-273 


Region Data 


27 


274-279 


District 


87 


280-306 


District Data 


27 


307-333 


Store Data 


27 


334-360 


Store Data 


27 


361-366 


District 


87 


367-393 


District Data 


27 


394-420 


Store Data 


27 


421-447 


Store Data 


27 



5 

The above table shows the contents of a contiguous block of memory that is 
447 bytes long. For purposes of illustration, the first column of the table is 
indented according to the level of the record. Thus, the hierarchical relationship 
among the records is shown. 
0 In the table, "Memory location" indicates the location of each record within 

the block of memory. In the interest of simplicity for this example, all container 
records occupy six bytes of memory and are shown in boldface. All data- 
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containing records occupy 27 bytes of memory and are shown in a normal 
typeface. In the preferred embodiment, container records and data-containing 
records may vary in length, as will be described in more detail below. 

For each record, there is a "Record Name" indicating the name of the 

5 corresponding record in Figure 6. Each record also has a "Length" that indicates 
the total length of the record itself plus all records it contains. Thus, for example, 
the "Region" record that is stored at memory location 34-39 has a "Length" of 207, 
indicating that it contains all records residing at memory locations 34 through 240. 
In general, a record having a starting memory location of X and a "Length" of L 

10 contains all records residing at memory locations numbered X through X+L-l. By 
definition, a data-containing record has a "Length" equal to the actual length of 
the record itself, since a data-containing record cannot contain another record. 

Thus, the placement of each record in memory, along with the "Length" 
values, define the hierarchical relationship among the records. 

15 These hierarchical relationships permit rapid navigation through the data 

structure. If a long record is encountered that is of no interest to a particular 
process, the process can skip over the record (and any records nested within) by 
reading the "Length" and stepping forward by a number of bytes equal to the 
"Length". When stepping forward in this manner, the process need not read the 

20 skipped-over record into memory; only the "Length" of the first record need be 
read. 

The hierarchical structure also permits creation of an object, or record, that 
contains information to be used by many different application programs 106 on 
the computer. Thus, for example, a level one record might represent an envelope 
25 to be handled by an electronic mail application, and might contain a number of 
level two records. Each level two record might be associated with a different 
application program 106, with the associations being defined by the code points. 
Each application program 106 need only be aware of the code point definitions for 
its associated records. 
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Finally, the hierarchical structure permits moving and copying of groups of 
nested records, or "chunks", at any level in the hierarchy. When performing an 
operation such as copying or moving a record, the system reads the length field 
and performs the operation on a continuous block of memory having the 
5 indicated length. Thus, in one operation, a record and all those records contained 
within it may be moved or copied. 
Data Format for Records 

Referring now to Figure 2, there is shown a diagram of the data formats 
used to store container records and data-containing records in the preferred 
10 embodiment. Container record 200 includes two fields: 
o code point 202; and 
p length 203. 

Code point 202 is two bytes (16 bits) long. Three bits are used to specify 
the length of length field 203, which can vary from zero (indicated by 000) to four 
15 bytes (indicated by 100). Eleven bits are used to specify a user-defined code that 
will be interpreted by the application program 106 making use of the invention. 
One bit is used to indicate that this is a container record (as opposed to a data- 
containing record). The final bit is free. 

Length 203 is from zero to four bytes long. The length of the length field 
20 203 is determined by a portion of code point 202, as described in the previous 

paragraph. For container records, the length field 203 specifies the total length of 
all records that the current record contains (as described above). Providing 
variable length for this field increases the flexibility of the data format. Since the 
length field is up to four bytes long, the value of length 203 can range from zero to 
25 FFFFFFFF (hex). 

Data-containing record 201 is similar to container record 200, except that it 
contains two additional fields: 1 
o format 204; and 
o data 205. 
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Format 204 is a one-byte field that specifies an attribute type that is used by 
the application program 106 in interpreting the stored data. Thus, the system pro- 
vides for automated translation of individual records from one data format to 
another. Required translation operations may be provided in the code point 
5 library. 

Data 205 is a variable-length field that contains the stored data. 

For data-containing records, the length field 203 specifies the length of the 
record itself, and therefore determines the length of data field 205. As stated 
above, the length of a data-containing record never extends beyond the end of the 
10 record itself. 
Code Points 

In the preferred embodiment, code points at various levels may be as- 
sociated with different application programs 106. Thus, a typical level one record 
might consist of a data-containing record representing an outer envelope for an 

15 electronic mail message to be processed by an electronic mail application, while a 
series of level two records within the level one record might consist of additional 
container records, each identifying a particular application program 106 to be 
used to process the higher-level records within. Any records of level three or 
greater would be passed to the individual application programs 106 for further 

20 processing. 

This technique permits the same code point to be reused at various levels 
of the data structure. For example, a code point of 5DF at level 3 can be dis- 
tinguished from a code point of 5DF at level 4, because the code points will be 
interpreted by different applications. 
25 In this way, the hierarchical structure provides a mechanism for defining a 

hierarchical relationship among various levels of data to be used by different 
applications. 

Other embodiments, in which the various levels of code points and records 
are handled differently, may also be used without departing from the essential 
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characteristics of the invention. 

Referring now to Figure 3, there is shown a sample string of 302 data bytes 
representing a series of hierarchically-organized records according to the present 
invention. Each byte is shown as a two-digit hexadecimal code 301, above which 
5 is shown the ASCII equivalent 302 for the byte. Each field is labeled to indicate its 
contents: "CP=CONT" is a container code point; "CP=DATA" is a data code 
point; "LEN" is a Length field; "FMT" is a Format field; and "DATA" is a data 
field. In the code points shown, the first byte is CO for container records or 60 for 
data-containing records, while the second byte represents an application-defined 
10 code. Format 02 indicates ASCII data, while formats 04 and 05 indicate other 
types of data. Figure 3 also includes horizontal lines 303 indicating the 
hierarchical relationships among the records. 
Operations 

Given the above data structure, the following operations are available: 

15 Delete 

Referring now to Figure 4, the process of deleting a record is illustrated. To 
delete a record, including all the records it contains, the system rewrites the entire 
data structure, leaving out the records to be deleted. This is done by setting a 
pointer to the beginning of the data structure 402, and copying each byte of the 

20 data structure 404. Before copying each byte, the system checks 403 to see if the 
pointer is at a record that is to be deleted. If so, the system reads 405 the length 
field of the record and advances the pointer 406 by a number of bytes equal to the 
length of the record. After skipping forward, the system resumes copying suc- 
cessive bytes. Thus, the record and all the records it contains, are left out of the 

25 new data structure. When the system reaches the end of a container record 407, it 
reduces 408 the length field of the container record by an amount equal to the 
total length of all deleted records that were within the container record. The 
system keeps track of this total length in a stack called ICPI (Internal Code Points 
Information) (not shown). The system then checks 409 whether the end of the 
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data structure was reached. If not, it repeats steps 403 through 409. 

This process preserves the hierarchical relationships among surviving 
records when a record is deleted. For example, in Figure 6, if one of the district 
records 605 were to be deleted (including all records contained therein), the length 
5 field of parent region record 602 would be reduced by 87 (the length of the 

deleted record). In turn, the length field of parent company record 600 would be 
reduced by 87. The length fields of all other records would remain unchanged, as 
no records within those records have been deleted. 
Add 

10 Referring now to Figure 5, the process of adding a new record to the end of 

the data structure is illustrated. First the system sets the pointer 502 to the begin- 
ning of the data structure. Then it reads 503 the length of the first record in the 
structure, a level 0 record. The system then advances the pointer 504 by a number 
of bytes equal to the length. Thus, the pointer is positioned at the end of the data 

15 structure defined by the length of the first level 0 record. Once the pointer is at 
the end, the system writes 505 the new data starting at the pointer's location. As 
the new data is written, the system keeps track of the length of the written data, 
using temporary storage in the ICPI stack. After the new data is written, the 
length of the level 0 record is increased by the length of the newly written data 

20 506. Thus, the level 0 record now contains the newly written data. 

Although the above method was described in terms of adding data at level 
1 of a data structure (so that level 0 contains the new data), the same technique 
could be used at any level. Once the new data is written, the system increases the 
value of the length records for all records containing the new data record. 

25 Insert 

Referring now to Figure 8, the process of inserting a new record into the 
data structure is illustrated. To insert a record, the system rewrites the entire data 
structure, with the new record inserted. This is done by setting a pointer to the 
beginning of the data structure 802, and copying each byte of the data structure 
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807 while successively advancing the pointer. When the pointer reaches the point 
at which the new record is to be inserted 803, it copies the new record 804. The 
code point for the new record is copied from the code point of an adjacent record 
having the same level. Alternatively, a new code point may be specified by the 
5 application commanding the insertion operation. When the system reaches the 
end of a container record 808, it increases 809 the length field of the container 
record by an amount equal to the total length of all records that were inserted into 
the container record. The system keeps track of this total length in a stack called 
ICPI (Internal Code Points Information) (not shown). The system then checks 810 
10 whether the end of the data structure was reached. If not, it repeats steps 803 
through 809. 

This process preserves the hierarchical relationships among records when a 
record is inserted. For example, in Figure 6, if one of the district records 605 were 
to be deleted (including all records contained therein), the length field of parent 
1 5 region record 602 would be reduced by 87 (the length of the deleted record). In 
turn, the length field of parent company record 600 would be reduced by 87. The 
length fields of all other records would remain unchanged, as no records within 
those records have been deleted. 
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Claims 

What is claimed is: 

1. A data management system, comprising: 
a processor; 

5 an input device coupled to the processor, f or accepting input; 

an output device coupled to the processor, for generating output; and 
a storage area coupled to the processor, for storing a plurality of records 
including a plurality of container records and a plurality of data- 
containing records, wherein: 
10 each container record is hierarchically linked to at least one record • 

and comprises a length field specifying a length; and 
each data-containing record comprises: 

a length field specifying a length; and 
a data field containing data. 

15 2. The data management system of claim 1, wherein: 

each record is stored in a corresponding portion of the storage area; and 
for each container record: 

the length field defines a portion of the storage area corresponding 
to a subset of the records; and 
20 the container record is hierarchically linked to the subset. 

3. The data management system of claim 1, wherein each record further 
comprises a code point field specifying a type for the record. 

4. The data management system of claim 1, wherein each data-containing 
record further comprises a format field specifying a data format for the data field. 

25 5. The data management system of claim 1, wherein the plurality of 

records is stored sequentially in the storage area. 
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6. A computer-implemented method of managing data, comprising the 
steps of: 

storing a plurality of container records, each comprising a length field 

specifying a length, the length field identifying a hierarchical link 
5 between the container record and at least one record; and 

storing a plurality of data-containing records, each comprising a length 
field specifying a length and a data field containing data. 

7. The computer-implemented method of managing data of claim 6, 
wherein the length field identifies a portion of the storage area corresponding to a 

10 subset of the records, and wherein: 

the step of storing a plurality of container records includes, for each 

container record, storing the container record in a corresponding 
portion of a storage area; and 
the step of storing a plurality of data-containing records includes, for each 
15 data-containing record, storing the data-containing record in a 

corresponding portion of the storage area. 

8. The computer-implemented method of managing data of claim 6, 
wherein: 

the step of storing a plurality of container records includes, for each 
20 . container record, storing a code point field specifying a type for the 

container record; and 
the step of storing a plurality of data-containing records includes, for each 

data-containing record, storing a code point field specifying a type 

for the data-containing record. 

25 9. The computer-implemented method of managing data of claim 6, 

wherein the step of storing a plurality of data-containing records includes , for 
each data-containing record, storing a format field specifying a data format for the 
data field. 
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10. The computer-implemented method of managing data of claim 6, 
wherein the plurality of records is stoifed sequentially in the storage area. 

11. A computer-implemented method of deleting a record in a data 
structure of a data management system, the data structure including a plurality of 

5 container records and a plurality of data-containing records, each record 

including a length field specifying a length, each data-containing record including 
a data field containing data, each container record having a hierarchical link to at 
least one record, the method comprising the steps of: 
(a) selecting a record to be deleted; 

10 (b) setting a pointer to indicate the beginning of the data structure; 

(c) determining whether the pointer indicates the selected record; 

(d) responsive to the pointer indicating the selected record, performing the 

substeps of: 

(d.l) reading the length field of the selected record; and 
15 (d.2) advancing the pointer according to the contents of the length 

field; 

(e) responsive to the pointer not indicating the selected record, copying the 

record indicated by the pointer and advancing the pointer; 

(f) determining whether the pointer indicates the end of a container record; 
20 (g) responsive to the pointer indicating the end of a container record 

having a hierarchical link to the selected record, reducing the length 
field of the container record by an amount equal to the length of the 
selected record; 

(h) determining whether the pointer indicates the end of the data structure; 

25 and 

(i) responsive to the pointer not indicating the end of the data structure, 

repeating steps (c) through (h). 
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12. A computer-implemented method of adding a new record to a data 
structure of a data management system, the data structure including a plurality of 
container records and a plurality of data-containing records, each record 
including a length field specifying a length, each data-containing record including 

5 a data field containing data, each container record having a hierarchical linked to 
at least one record, the method comprising the steps of: 

(a) reading a length field of a first container record; 

(b) advancing a pointer according to the contents of the first container 

record; 

10 (c) writing the new record at the location indicated by the pointer; and 

(d) increasing the value of the length field of the first container record by 
an amount corresponding to the length of the new record. 

13. A computer-implemented method of inserting a new record in a data 
structure of a data management system, the data structure including a plurality of 

15 container records and a plurality of data-containing records, each record 

including a length field specifying a length, each data-containing record including 
a data field containing data, each container record having a hierarchical link to at 
least one record, the method comprising the steps of: 

(a) selecting an insertion point within the data structure for the new record; 

20 (b) setting a pointer to indicate the beginning of the data structure; 

(c) determining whether the pointer indicates the insertion point; 

(d) responsive to the pointer indicating the insertion point, copying the 

new record; 

(e) responsive to the pointer not indicating the insertion point, copying the 
25 record indicated by the pointer and advancing the pointer; 

(f) determining whether the pointer indicates the end of a container record; 

(g) responsive to the pointer indicating the end of a container record 

having a hierarchical link to the new record, increasing the length 
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field of the container record by an amount equal to the length of the' 
new record; 1 
(h) determining whether the pointer indicates the end of the data structure; 
and 

5 (i) responsive to the pointer not indicating the end of the data structure, 

repeating steps (c) through (i). 
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